Print the 6 first lines of the R-built-in data.frame
trees
## Girth Height Volume
## 1 8.3 70 10.3
## 2 8.6 65 10.3
## 3 8.8 63 10.2
## 4 10.5 72 16.4
## 5 10.7 81 18.8
## 6 10.8 83 19.7
Print only the column names
## [1] "Girth" "Height" "Volume"
What is the dimension of
trees?
## [1] 31 3
Plot the trees height and volume as a function of their girth in two different graphs. Make sure the axis labels are clear
In each graph, add a red dashed line corresponding to the relevant correlation that you observe (average value, linear correlation…)
plot(trees$Girth, trees$Height, xlab='Girth', ylab='Height')
abline(col='red',lty=2,h=mean(trees$Height))plot(trees$Girth, trees$Volume, xlab='Girth', ylab='Volume')
fit <- lm(trees$Volume~trees$Girth)
abline(col='red',lty=2, coef(fit))ggplot(data=trees, aes(Girth, Height))+
geom_point()+geom_hline(col='red',yintercept=mean(trees$Height),lty=2)Explain your choice and write the corresponding values (average value and standard deviation, or slope, intercept and corresponding errors). Round the values to 2 decimals.
The average height is 76 with standard deviation 6.37.
The volume evolves with a slope 5.07 ± 0.25 and intercept -36.94 ± 3.37.
Print the 3 first lines of the R-built-in data.frame
USArrests. This data set contains statistics about violent crime rates by US state. The numbers are given per 100 000 inhabitants, except forUrbanPopwhich is a percentage.
## Murder Assault UrbanPop Rape
## Alabama 13.2 236 58 21.2
## Alaska 10.0 263 48 44.5
## Arizona 8.1 294 80 31.0
What is the average murder rate in the whole country?
## [1] 7.788
What is the state with the highest assault rate?
## [1] "North Carolina"
Create a subset of
USArrestsgathering the data for states with an urban population above (including) 80%.
How many states does that correspond to?
## [1] 12
Within these states, what is the state with the smallest rape rate?
## [1] "Rhode Island"
Print this subset ordered by decreasing urban population.
## Murder Assault UrbanPop Rape
## California 9.0 276 91 40.6
## New Jersey 7.4 159 89 18.8
## Rhode Island 3.4 174 87 8.3
## New York 11.1 254 86 26.1
## Massachusetts 4.4 149 85 16.3
## Hawaii 5.3 46 83 20.2
## Illinois 10.4 249 83 24.0
## Nevada 12.2 252 81 46.0
## Arizona 8.1 294 80 31.0
## Florida 15.4 335 80 31.9
## Texas 12.7 201 80 25.5
## Utah 3.2 120 80 22.9
Print this subset ordered by decreasing urban population and increasing murder rate.
## Murder Assault UrbanPop Rape
## California 9.0 276 91 40.6
## New Jersey 7.4 159 89 18.8
## Rhode Island 3.4 174 87 8.3
## New York 11.1 254 86 26.1
## Massachusetts 4.4 149 85 16.3
## Hawaii 5.3 46 83 20.2
## Illinois 10.4 249 83 24.0
## Nevada 12.2 252 81 46.0
## Utah 3.2 120 80 22.9
## Arizona 8.1 294 80 31.0
## Texas 12.7 201 80 25.5
## Florida 15.4 335 80 31.9
Plot an histogram of the percentage of urban population with a binning of 5%. Add a vertical red line marking the average value. Make sure the x axis shows the [0,100] range.
ggplot(data=USArrests, aes(x=UrbanPop))+
geom_histogram(breaks=seq(0,100,5),color="black", alpha=.2)+
geom_vline(xintercept = mean(USArrests$UrbanPop), col='red')Is there a correlation between the percentage of urban population and the various violent crime rates? argument your answer with plots.
par(mfrow=c(2,2),mar=c(4,4,1,1))
plot(USArrests$UrbanPop,USArrests$Murder, pch=16)
plot(USArrests$UrbanPop,USArrests$Assault, pch=16)
plot(USArrests$UrbanPop,USArrests$Rape, pch=16)p1 <- ggplot(data=USArrests, aes(x=UrbanPop, y=Murder))+geom_point()
p2 <- ggplot(data=USArrests, aes(x=UrbanPop, y=Assault))+geom_point()
p3 <- ggplot(data=USArrests, aes(x=UrbanPop, y=Rape))+geom_point()
plot_grid(p1,p2,p3,ncol=2)No clear correlation appears.
In high-pressure experiments, the pressure in the Diamond Anvil Cell (DAC) is calibrated through the measure of the Raman shift of a tiny ruby crystal placed in the pressure transmitting medium next to the measured sample.
Write a function returning the pressure \(P\) as a function of the ruby Raman shift position \(\omega\) and the excitation laser wavelength \(\lambda_l\): \[P(\omega, \lambda_l) = \frac{A}{B}\left[\left(\frac{\lambda}{\lambda_0}\right)^B-1\right] (GPa)\] where \(A=1876\) and \(B=10.71\), \(\lambda\) is the the measured wavelength of the ruby \(R_1\) line (the most energetic one) and \(\lambda_0=694.24\) nm is the zero-pressure value at 298 K1. The relationship between the wavenumber \(\nu\) in cm\(^{-1}\) and the wavelength \(\lambda\) in nm is given by \(\nu(\text{cm}^{-1})=\frac{10^7}{\lambda(\text{nm})}\), and the Raman shift \(\omega=\Delta\nu=\nu_l-\nu=\frac{10^7}{\lambda_l}-\frac{10^7}{\lambda}\) (cm\(^{-1}\)).
P <- function(w,laser=568.189){
A <- 1876
B <- 10.71
lambda0 <- 694.24
lambda <- 1e7*laser/(1e7-w*laser)
A/B*((lambda/lambda0)^B-1)
}Write a function returning a normalized Lorentzian as a function of its center \(x_0\) and its full width at half maximum \(\Gamma\): \[L(x,x_0,\Gamma)=\frac{\Gamma}{2\pi}\frac{1}{\frac{\Gamma^2}{4}+\left(x-x_0\right)^2}\]
Store the list of files containing
rubyin their name in theData/folder into a variableflist. Print its length.
## [1] 5
Plot with points the first file in
flist. Find the position of its maximum and store it inxmax. Guess roughly the parameters needed to fit the experimental data byy0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2), and add a blue line on the plot to represent this function.
d <- read.table(file.path("Data",flist[1]),
header=FALSE,
col.names=c("w","Intensity"))
x <- d[,1]
y <- d[,2]
xmax <- x[which.max(y)]
y0 <- .1;
A1 <- 100000; A2 <- 200000;
FW1 <- 10; FW2 <- 10;
x1 <- xmax-30; x2 <- xmax
plot(x,y)
lines(x,y0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2), col="blue")ggplot(data=d, aes(x=w, y=Intensity))+
geom_point()+
geom_line(aes(x=w, y=y0+A1*L(w,x1,FW1)+A2*L(w,x2,FW2)), col="blue")Using
nls(), fit the first spectrum inflistbyy0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2)and using the starting parameters you defined before. Plot the experimental data again and add the fitted spectrum as a red line.
fit <- nls(y~y0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2),
start=list(y0=y0, A1=A1, A2=A2,
FW1=FW1,FW2=FW2,x1=x1,x2=x2)
)
plot(x,y)
lines(x, predict(fit), col="red")Based on the above procedure, for each file in
flist(so, use aforloop), fit the Raman spectrum by the sum of two Lorentzian functions, and store the fitting parameters into a data.frame calledruby_fitalso containing the names of the corresponding files. Attention: the initial guesses for amplitudes and widths can be constant, but the peaks positions should evolve for each spectrum. The difference between the two peaks is always roughly 30 cm\(^{-1}\), and the largest peak is always the most energetic one. Check that your fits are correct by printing the experimental data and the fitted result at each iteration (add the name of the file as the plot title).
ruby_fit <- data.frame()
par(mfrow=c(3,2),mar=c(4,4,1,1))
for(f in flist){#f <- flist[1]
d <- read.table(file.path("Data",f),
header=FALSE,
col.names=c("w","Intensity"))
x <- d[,1]
y <- d[,2]
xmax <- x[which.max(y)]
y0 <- .1;
A1 <- 100000; A2 <- 200000;
FW1 <- 10; FW2 <- 10;
x1 <- xmax-30; x2 <- xmax
fit <- nls(y~y0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2),
start=list(y0=y0, A1=A1, A2=A2,
FW1=FW1,FW2=FW2,x1=x1,x2=x2)
)
ruby_fit <- rbind(ruby_fit,
data.frame(name=f, t(as.data.frame(coef(fit)))))
plot(x,y, main=f)
lines(x, predict(fit), col="red")
}
row.names(ruby_fit) <- 1:nrow(ruby_fit)Add a column in
ruby_fitcorresponding to the estimated pressure rounded to 1 decimal. The excitation wavelength in this experiment was 532 nm. Print the resultingruby_fittable usingknitr::kable(ruby_fit)
| name | y0 | A1 | A2 | FW1 | FW2 | x1 | x2 | P |
|---|---|---|---|---|---|---|---|---|
| SWNT_ruby_01.txt | 121.31548 | 166356.2 | 327696.3 | 11.39192 | 15.06717 | 4371.581 | 4401.312 | 1.1 |
| SWNT_ruby_02.txt | 90.62964 | 152514.6 | 292086.0 | 11.45237 | 14.76490 | 4391.899 | 4421.366 | 3.8 |
| SWNT_ruby_03.txt | 92.03126 | 182837.6 | 335173.7 | 11.53816 | 14.62646 | 4414.858 | 4444.203 | 6.8 |
| SWNT_ruby_04.txt | 319.62879 | 294710.7 | 577912.4 | 11.08481 | 14.16231 | 4444.792 | 4474.054 | 10.9 |
| SWNT_ruby_05.txt | 121.25907 | 155329.7 | 392016.5 | 13.02157 | 17.95192 | 4482.789 | 4513.105 | 16.5 |
Store all file names containing “RBM” into a variable
fRBM. Load all the corresponding spectra into a singledata.framecalledspecwith 3 columns: Raman shift \(\omega\), Intensity, Pressure. Of course, the indexes in the file names between the ruby and RBM files match. In the Intensity column, store the intensity normalized to [0,1].
fRBM <- list.files(path="Data",pattern="RBM")
spec <- data.frame()
norm01 <- function(x) {(x-min(x))/(max(x)-min(x))}
for (i in 1:length(fRBM)){
d <- read.table(file.path("Data",fRBM[i]),
header=FALSE,
col.names=c("w","Intensity"))
d$Intensity <- norm01(d$Intensity)
spec <- rbind(spec, data.frame(d,P=ruby_fit$P[i]))
}Using
ggplot2, plot with points the stacked normalized RBM band spectra vertically shifted by P, with a color for each spectrum corresponding to the pressure. Make the plot interactive.
colors <- colorRampPalette(c("royalblue","seagreen","orange","red","brown"))(length(fRBM))
p <- ggplot(data=spec, aes(x=w,y=Intensity*3+P, color=factor(P)))+
scale_colour_manual(values=colors,name="P [GPa]")+
geom_point(size=.5, alpha=.5)
p# Other solution
p <- ggplot(data=spec, aes(x=w,y=Intensity*3+P, color=P))+
scale_colour_gradientn(colors=colors,name="P [GPa]")+
geom_point(size=.5, alpha=.5)
p
ggplotly(p)